63 research outputs found
Generative Knowledge Selection for Knowledge-Grounded Dialogues
Knowledge selection is key to knowledge-grounded dialogues (KGD); it aims to
select an appropriate knowledge snippet to be used in the utterance based on
the dialogue history. Previous studies mainly employ a classification
approach to classify each candidate snippet as "relevant" or "irrelevant"
independently. However, such approaches neglect the interactions between
snippets, leading to difficulties in inferring the meaning of snippets.
Moreover, they lack modeling of the discourse structure of dialogue-knowledge
interactions. We propose a simple yet effective generative approach for
knowledge selection, called GenKS. GenKS learns to select snippets by
generating their identifiers with a sequence-to-sequence model. GenKS therefore
captures intra-knowledge interaction inherently through attention mechanisms.
Meanwhile, we devise a hyperlink mechanism to model the dialogue-knowledge
interactions explicitly. We conduct experiments on three benchmark datasets,
and verify that GenKS achieves the best results on both knowledge selection and
response generation.
Comment: Findings of EACL-2
Towards Empathetic Dialogue Generation over Multi-type Knowledge
Enabling machines with empathetic abilities to provide context-consistent
responses is crucial on both semantic and emotional levels. The task of
empathetic dialogue generation is proposed to address this problem. However,
lacking external knowledge makes it difficult to perceive implicit emotions
from limited dialogue history. To address the above challenges, we propose to
leverage multi-type knowledge, i.e., commonsense knowledge and an emotional
lexicon, to explicitly understand and express emotions in empathetic dialogue
generation. We first enrich the dialogue history by jointly interacting with
two-type knowledge and construct an emotional context graph. Then we introduce
a multi-type knowledge-aware context encoder to learn emotional context
representations and distill emotional signals, which are prerequisites for
predicting the emotions expressed in responses. Finally, we propose an emotional
cross-attention mechanism to exploit the emotional dependencies between the
emotional context graph and the target empathetic response. Extensive
experiments on a benchmark dataset show that our proposed framework
outperforms state-of-the-art baselines in terms of automatic metrics and human
evaluations.
Comment: arXiv admin note: text overlap with arXiv:1911.0869
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent
Large Language Models (LLMs) have demonstrated a remarkable ability to
generalize zero-shot to various language-related tasks. This paper explores
generative LLMs such as ChatGPT and GPT-4 for relevance ranking in Information
Retrieval (IR). Surprisingly, our experiments reveal that properly instructed
ChatGPT and GPT-4 can deliver results competitive with, and even superior to,
those of supervised methods on popular IR benchmarks. Notably,
GPT-4 outperforms the fully fine-tuned monoT5-3B on MS MARCO by an average of
2.7 nDCG on TREC datasets, an average of 2.3 nDCG on eight BEIR datasets, and
an average of 2.7 nDCG on ten low-resource languages in Mr.TyDi. Subsequently, we
delve into the potential for distilling the ranking capabilities of ChatGPT
into a specialized model. Our small specialized model, trained on 10K
ChatGPT-generated data, outperforms monoT5 trained on 400K annotated MS MARCO
data on BEIR. The code to reproduce our results is available at
www.github.com/sunnweiwei/RankGP
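The nDCG figures quoted in the abstract above can be made concrete with a small sketch. This is not taken from the paper's code: the function names are hypothetical, and the gain formulation (exponential gain, log2 discount) and the cutoff `k` are assumptions, since the abstract states neither. Values returned here lie in [0, 1]; ranking papers commonly report them scaled by 100.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels.

    Assumes the exponential-gain variant: (2^rel - 1) / log2(rank + 1),
    with ranks starting at 1.
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """Normalize DCG by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents: define nDCG as 0
    return dcg_at_k(relevances, k) / ideal


# Relevance labels of documents in the order a re-ranker returned them.
reranked = [0, 1]
score = ndcg_at_k(reranked, k=2)
```

A re-ranker that moves the relevant document from rank 2 to rank 1 would raise `score` to 1.0, which is the kind of ordering improvement the metric rewards.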
Towards Explainable Conversational Recommender Systems
Explanations in conventional recommender systems have demonstrated benefits
in helping the user understand the rationality of the recommendations and
improving the system's efficiency, transparency, and trustworthiness. In the
conversational environment, multiple contextualized explanations need to be
generated, which poses further challenges for explanations. To better measure
explainability in conversational recommender systems (CRS), we propose ten
evaluation perspectives based on concepts from conventional recommender systems
together with the characteristics of CRS. We assess five existing CRS benchmark
datasets using these metrics and observe the necessity of improving the
explanation quality of CRS. To achieve this, we use manual and automatic
approaches to extend these dialogues and construct a new CRS dataset, namely
Explainable Recommendation Dialogues (E-ReDial). It includes 756 dialogues with
over 2,000 high-quality rewritten explanations. We compare two baseline
approaches to perform explanation generation based on E-ReDial. Experimental
results suggest that models trained on E-ReDial can significantly improve
explainability, while introducing knowledge into the models can further improve
the performance. GPT-3 in the in-context learning setting can generate more
realistic and diverse movie descriptions. In contrast, T5 trained on E-ReDial
can better generate clear reasons for recommendations based on user
preferences. E-ReDial is available at https://github.com/Superbooming/E-ReDial
Learning to Ask Conversational Questions by Optimizing Levenshtein Distance
Conversational Question Simplification (CQS) aims to simplify self-contained
questions into conversational ones by incorporating some conversational
characteristics, e.g., anaphora and ellipsis. Existing maximum likelihood
estimation (MLE) based methods often get trapped in easily learned tokens as
all tokens are treated equally during training. In this work, we introduce a
Reinforcement Iterative Sequence Editing (RISE) framework that optimizes the
minimum Levenshtein distance (MLD) through explicit editing actions. RISE is
able to pay attention to tokens that are related to conversational
characteristics. To train RISE, we devise an Iterative Reinforce Training (IRT)
algorithm with a Dynamic Programming based Sampling (DPS) process to improve
exploration. Experimental results on two benchmark datasets show that RISE
significantly outperforms state-of-the-art methods and generalizes well on
unseen data.
Comment: 13 pages, 4 figures, Published in ACL 202
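For readers unfamiliar with the quantity RISE optimizes, the following is a minimal sketch of the standard dynamic-programming computation of Levenshtein distance between two token sequences. It illustrates only the distance itself, not the RISE framework, its editing actions, or its training algorithm; the function name and the example questions are hypothetical.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions, and substitutions
    needed to turn `source` into `target` (any two sequences)."""
    # prev[j] holds the distance between the processed prefix of
    # `source` and the first j items of `target`.
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        curr = [i]  # turning i source items into an empty target: i deletions
        for j, t in enumerate(target, start=1):
            substitution_cost = 0 if s == t else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete s
                    curr[j - 1] + 1,                  # insert t
                    prev[j - 1] + substitution_cost,  # keep or substitute
                )
            )
        prev = curr
    return prev[-1]


# Token-level distance between a self-contained question and a
# conversational rewrite (both strings are made up for illustration).
self_contained = "who is the president of france".split()
conversational = "who is he".split()
distance = levenshtein(self_contained, conversational)
```

Working at the token level rather than the character level matches the abstract's framing, where edits target tokens tied to conversational characteristics such as anaphora and ellipsis.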